Fix #815: deduplicate SessionConnector::watch() emissions with debouncing by AlexMikhalev · Pull Request #858 · terraphim/terraphim-ai

AlexMikhalev · 2026-04-30T10:55:52Z

Summary

Add per-path HashMap<PathBuf, (usize, Instant)> tracking in NativeTerraphim AIConnector::watch()
Only emit session when messages.len grew AND 250 ms debounce window elapsed since last emission
Document dedup contract on the SessionConnector::watch trait method
Fix clippy collapsible_if in the emission guard
Tighten integration test: add lower-bound assertion and narrow upper bound from 5 to 4

Test plan

cargo test -p terraphim_sessions — all 54 tests pass including test_watch_dedup_on_incremental_writes
cargo clippy -p terraphim_sessions — clean
cargo fmt -p terraphim_sessions -- --check — clean

Refs #815

🤖 Generated with Terraphim AI

… + AutoMerge enqueue' (#717) from task/roc-v1-step-f-verdict-polling into main

…Refs adf-fleet#35

…impl — Refs adf-fleet#35

…ostMergeTestGate enqueue + [ADF] issue on failure) — Refs adf-fleet#35

…andle_auto_merge — Refs adf-fleet#35

…ath — Refs adf-fleet#35 Add 5 integration tests driving handle_auto_merge_for_project with an in-memory RecordingExecutor (no mock frameworks, async-trait impl): * auto_merge_success_enqueues_post_merge_gate * auto_merge_skipped_when_head_sha_changed * auto_merge_failure_opens_adf_issue * auto_merge_marks_dedupe_set_on_success * auto_merge_skipped_when_pr_already_closed Expose pub auto_merge_enqueued() accessor on AgentOrchestrator so tests can verify the (project, pr_number, head_sha) dedupe set is populated after a successful merge.

… from task/roc-v1-step-g-auto-merge-handler into main

…ait + config — Refs adf-fleet#36 Adds crates/terraphim_orchestrator/src/post_merge_gate.rs with: - CommandRunner async trait + TokioCommandRunner real impl (streams stdout/stderr tails via bounded ring buffer so long test runs do not OOM; kills child on wall-time timeout). - run_workspace_tests / classify_failure / revert_merge helpers. - ScriptedRunner test double (no mock library — trait impl only). - 9 inline unit tests covering runner behaviour, failure parsing, and revert paths (green, timeout, io-error, test-failure parse, harness detection, timeout classification, revert no-push, revert with push, revert all-paths-failed). Adds PostMergeGateConfig to OrchestratorConfig (optional; defaults to 10 minute budget, push revert to origin main). Back-fills the new field on every existing OrchestratorConfig initializer across lib.rs test fixtures and integration tests.

…k - Refs adf-fleet#36 Replaces the Step B log-only stub for DispatchTask::PostMergeTestGate with a real handler that: - Resolves project working_dir as repo_root - Runs post_merge_gate::run_workspace_tests with GateConfig built from orchestrator.toml [post_merge_gate] overrides (default 10 min budget) - On green: logs post_merge_gate_verified at info - On red: classifies failure, calls post_merge_gate::revert_merge to create revert commit + push to remote, opens [ADF] post-merge test gate reverted issue on the project's Gitea repo with merge_sha, revert_sha, top failing tests, stderr tail, wall_time - Logs post_merge_gate_reverted at warn handle_post_merge_test_gate_with_runner<R: CommandRunner + ?Sized> is parametrised so integration tests can inject ScriptedRunner without spawning cargo. Adds truncate_for_issue helper for char-boundary-safe trimming of stderr tails when opening the [ADF] issue (avoids corrupting UTF-8 mid-sequence). All orchestrator tests pass; clippy clean; fmt clean.

) from task/roc-v1-step-h-post-merge-gate into main

…/merge — Refs adf-fleet#37 Add OrchestratorEvent enum (PrReviewed, PrAutoMerged, PrAutoMergedVerified, PrAutoReverted) to quickwit.rs with emit_event helper on QuickwitFleetSink. All emits are gated behind the quickwit feature and tolerate Quickwit down (warn log, business logic unblocked). Wire emit sites: - PrReviewed in poll_pending_reviews_for_project after verdict parse - PrAutoMerged in handle_auto_merge_for_project after merge_pr succeeds - PrAutoMergedVerified in handle_post_merge_test_gate_with_runner on green - PrAutoReverted in handle_post_merge_test_gate_with_runner on revert Replace the two quickwit_event_placeholder log lines from Step H with the typed emits above. Add tests/quickwit_events_tests.rs: 4 field-mapping tests + 1 tolerance test (all gated behind quickwit feature). 501 tests pass; clippy clean; fmt clean. Co-Authored-By: Terraphim AI <noreply@terraphim.cloud>

…e flow (ROC v1 Step I)' (#726) from task/roc-v1-step-i-quickwit-events into main

…reviewers event-driven) — Refs adf-fleet#39 No template change needed: implementation-swarm is not yet in scripts/adf-setup/agents/. Developer cron bump to */20 applies to live conf.d/*.toml during Step L rollout.

…ump (ROC v1 Step K)' (#727) from task/roc-v1-step-k-cron-extension into main

…lo -> terraphim -- Refs adf-fleet#40 Co-Authored-By: Terraphim AI <noreply@terraphim.cloud>

…728) from task/roc-v1-step-l-rollout-runbook into main

Updates the adf.rs provider registration from Kimi K2.5 to K2.6 so the ProviderHealthMap probe + KeywordRouter fallback target the current model. KG routing tables (planning_tier.md) and conf.d/digital-twins.toml already reference kimi-for-coding/k2p6; this aligns register_providers. Tests unchanged (test fixtures use synthetic k2p5 strings that remain valid — provider_key_for_model is generic and C1 allow-list checks the kimi-for-coding prefix, not the specific model string).

…tion to k2p6' (#734) from task/kimi-k2p6-provider into main

…rand - Patch rustls-webpki from 0.103.10 to 0.103.12 (fixes name constraints bypass) - Add RUSTSEC-2026-0098/0099 to deny.toml (serenity/discord-feature-only) - Replace rand with WASM-compatible fastrand in terraphim_multi_agent and terraphim_kg_agents - Remove two direct rand dependencies from workspace crates Refs #630

…on (#818) * feat(cli): add evaluate subcommand for automata ground-truth evaluation Wire the existing evaluate() function to a CLI subcommand in terraphim_cli. Changes: - Add Evaluate command with --ground-truth and --thesaurus flags - Add handle_evaluate() function using terraphim_automata::evaluate() - Add 4 integration tests for evaluate command - Wire Evaluate match arm in command dispatcher The core evaluation logic was already implemented in terraphim_automata::evaluation (~613 lines, 13 unit tests). This adds CLI integration for automation use. Example usage: terraphim-cli evaluate --ground-truth gt.json --thesaurus th.json Part of: Gitea #576 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(clippy): replace sort_by with sort_by_key for Rust 1.95 compatibility Rust 1.95 promotes clippy::unnecessary_sort_by to hard error under -D warnings. Convert all sort_by calls to sort_by_key across 3 crates: - terraphim-markdown-parser: 1 change (descending sort with Reverse) - terraphim_router: 1 change (descending sort with Reverse) - terraphim-session-analyzer: 13 changes (ascending + descending) Line 548 in reporter.rs retains sort_by with #[allow] due to fallible string parsing in the key function. Refs #576 * fix(clippy): fix remaining sort_by in session-analyzer main.rs and submodules Missed in previous commit: session-analyzer has duplicated logic in main.rs (binary target) and submodules (kg/search, patterns/loader) that also use sort_by. Convert to sort_by_key where possible, add #[allow] for float comparisons using partial_cmp. Refs #576 * fix(clippy): workspace-wide sort_by -> sort_by_key for Rust 1.95 compatibility Convert all remaining sort_by calls across 40 files to either sort_by_key or #[allow(clippy::unnecessary_sort_by)] for cases with non-Copy types, multi-line closures, or partial_cmp on floats. Covers: terraphim_agent, terraphim_automata, terraphim_orchestrator, terraphim_service, terraphim_persistence, terraphim_update, terraphim_usage, terraphim_sessions, terraphim_cli, terraphim_mcp_server, terraphim_types, terraphim_symphony, terraphim_tinyclaw, terraphim_multi_agent, terraphim_agent_evolution, terraphim_agent_registry, terraphim_goal_alignment Refs #576 * fix(clippy): fix Rust 1.95 lints in task_decomposition and rolegraph examples - Remove unnecessary .into_iter() in extend() call (useless_conversion lint) - Collapse if guards into match arms (collapsible_match lint) - Allow explicit_counter_loop in rolegraph examples Refs #576 * fix(clippy): allow collapsible_match lint in middleware and agent_evolution Rust 1.95 clippy promotes collapsible_match to hard error under -D warnings. Add #![allow] at file level for ripgrep.rs, orchestrator_workers.rs, and parallelization.rs where collapsing the match arms would reduce readability. Refs #576 * fix(clippy): allow collapsible_match in goal_alignment/goals.rs Refs #576 * fix(ci): pin Rust toolchain to 1.94.0 to match local dev dtolnay/rust-toolchain@stable installs latest (1.95.0) which has new clippy lints (collapsible_match, unnecessary_sort_by, useless_conversion) not present in 1.94. Pin all ci-pr.yml jobs to 1.94.0 and update rust-toolchain.toml accordingly. Refs #576 --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Cherry-pick from local main (0115f06f). Refs #612 Co-authored-by: Test User <test@test.com>

Pre-build at script line 98 ran cargo build --workspace --all-targets without --features zlob. fff-search build.rs panics under CI when zlob isn't enabled (intentional gate). Clippy step at line 112 already had the flag; pre-build needed it too. Unblocks lint-and-format CI for PR #818 and any future PR. Refs #818 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

openai/* via the OpenAI Plus/Pro/Team plan is subscription-gated, not pay-per-use. PR #631 (the P0 C1 fix) removed openai from the allow-list under the mistaken belief that all openai/* was pay-per-use; this commit restores it. Session-token exhaustion (the temporary state that motivated #631 in the first place) is already handled correctly by the orchestrator's existing primitives: - ProviderHealthMap probes every probe_ttl_secs (1800s / 30 min) and marks exhausted providers unhealthy. Routing falls through to the next KG route (anthropic/opus -> kimi-for-coding/k2p6 -> openai/gpt-5.4 -> zai-coding-plan/glm-5). - CircuitBreaker opens on 5 consecutive failures, half-opens after cooldown, closes on successful probe. - ProviderBudgetTracker hour+day tumbling UTC windows; force_exhaust() on throttle classification pushes past cap so routing drops until the next window rolls. - error_signatures.rs classifies stderr as Throttle / Flake / Unknown per provider with configurable regex. Changes: - config.rs: add "openai" to ALLOWED_PROVIDER_PREFIXES. - config.rs tests: include openai/gpt-5.4 and openai/gpt-5.3-codex in test_is_allowed_provider_c1_allowed_prefixes. - provider_gate_tests.rs: rename probe_gate_rejects_openai_provider -> probe_gate_allows_openai_subscription_provider; invert assertion and document the session-token-tracking rationale. - docs/taxonomy/routing_scenarios/adf/planning_tier.md: retain anthropic/opus as primary route, add kimi-for-coding/k2p6 and openai/gpt-5.4 as alternate routes; remove the previously-present legacy openai/gpt-5.4 entry that had been stripped as a C1 violation. Tests: all 497 unit tests + 16 integration tests pass. Clippy + fmt clean.

…st + KG routing' (#737) from task/kg-planning-tier-k2p6 into main

Replace the in-memory DeviceStorage backend with MarkdownLearningStore for durable persistence of shared learnings across process restarts. Changes: - Expand markdown frontmatter to serialize full SharedLearning state - Add backwards-compatible parsing for old sparse frontmatter files - Replace DeviceStorage with MarkdownLearningStore in SharedLearningStore - Implement real load_all() that hydrates index from persisted markdown - Add startup deduplication by learning.id (canonical over shared copies) - Update StoreConfig to include MarkdownStoreConfig - Align default paths with ProjectDirs recommendation - Add integration tests for restart durability and deduplication - Fix unreachable match arm for SharedLearningSub::Inject Refs #6

Handle missing learnings root as an empty store on first run and make startup deduplication deterministic by preferring canonical agent copies over shared replicas when IDs collide. Refs #6

Adds the agent template that the Phase 2 fan-out loop spawns when `pr-security-sentinel` appears in `[pr_dispatch.agents_on_pr_open]`. Mirrors the pr-spec-validator skeleton: - sonnet (subscription-only) — security-audit needs deeper reasoning than a structural diff scan to spot data-flow vulnerabilities and unsafe-code subtleties; haiku is the wrong tier - skill_chain = ["security-audit"] - 2-hour idempotency check via the existing comment-marker pattern (uses agent-scoped marker "Last security-audited commit: <sha>") - best-effort POST_STATUS helper mirroring build-runner.toml Path filter (early-exit happy path) — implemented in this script, NOT in the orchestrator: - skips the LLM call when no `Cargo.toml`, `Cargo.lock`, `crates/**/src/**`, or `**/secrets/**` paths changed - posts `success` (not no-post) with description "n/a — no security-relevant changes" because `adf/security` becomes a required check on `main` post-deploy; a never-resolved `pending` would block docs-only PRs forever Verdict parsing prefers an explicit `Verdict:` line; falls back to a `Risk:` line (critical/high → fail, medium → concerns, low → pass); finally falls back to `state=success` "audit posted; manual gate" when neither parses — so the commit status is never left in `pending` even on transient agent failures. The cron-scheduled `security-sentinel` agent stays in tree for full-repo audits; this PR-event variant is additive and diff-scoped. Operator follow-up after merge: 1. Rebuild + restart the orchestrator on bigbox so the new dispatch arm goes live. 2. PATCH branch_protections/main to add `adf/security` to required contexts. Refs #953

…2c) Refs #953 Rebased onto main after Phase 2b/d/e merges.

…open with path filter' (#1028) from task/953-adf-phase-2c-security-sentinel into main

…eMeta Refs #1026' (#1029) from sync/1026-query-role-meta into main

…eports for #1026' (#1030) from sync/vv2-1026 into main

…ams SDK tests Refs #1034

… #1012 - Intercept args before clap parsing to apply typo correction, alias expansion, and case-insensitive matching - Add build_cli_forgiving_parser() with actual CLI subcommands - Add apply_forgiving_parsing() heuristic for subcommand detection - Print correction notifications to stderr - Add 3 integration tests: auto-correction, alias expansion, case-insensitive matching

…PI Refs #1011 - Add Robot variant to Command enum with subcommands: capabilities, schemas, examples - Add RobotFormat enum (json, table, minimal) for --format flag - Add handle_robot_command() and format_robot_output() helpers - Wire Robot command in main() before catch-all server/offline dispatch - Add unreachable! arms in run_offline_command and run_server_command to satisfy exhaustive match (Robot is handled in main) - Add 4 integration tests: capabilities JSON, schemas JSON, examples text, all formats honoured - Fix pre-existing compilation errors in terraphim_orchestrator (config clone + concurrency_permit field)

- Add [Unreleased] section with commits since 1.17.0 - Generate doc gap report: 771 undocumented items across 46 crates - Generate API reference snippets from documented public APIs Refs #1046

- Change ContentBlock::ToolResult from is_error: bool to exit_code: i32 with backward-compatible deserialization for old JSON containing is_error - Add --include-pinned flag to search subcommand in terraphim_agent - Add --pinned flag to graph subcommand as kg list --pinned equivalent - Fix learn query to call query_all_entries when semantic=false, matching spec - Add pinned_node_ids to RoleGraphResponseDto for server/agent parity - Add get_role_graph_pinned to TuiService for offline pinned retrieval

- Add spec gaps fix entry to CHANGELOG (Refs #1040) - Generate documentation gap report for workspace crates - Generate API reference snippets for key public items - Update existing Gitea issue #1046 with recurrence findings Part of: #1046

…Refs #451 - Add hook_manager field to TerraphimAgent with HookManager initialized in ::new() - Add execute_llm_with_hooks() helper that wraps LLM calls with run_pre_llm/run_post_llm - Wire hooks into 5 agent command handlers: generate, answer, analyze, create, review - Wire hooks into specialized agents: ChatAgent and SummarizationAgent - Add unit tests verifying hook invocation (pre_llm, post_llm, block, modify decisions) - All existing tests pass, clippy clean, fmt clean Implements requirement from spec-validator remediation #442 to wire LLM hooks that were previously defined but not connected.

- Update [Unreleased] with 20 entries from 2026-04-27 to 2026-04-28 - Scan 58 crates: ~2,647 pub items missing documentation - Generate gap report at .docs/reports/documentation-gap-report-20260428.md - Recurrence comment on Gitea issue #1046 Refs #1046

Clippy fixes (blocking ADF build-runner on all PRs): - Remove needless borrows in exit_codes_integration_test.rs - Replace useless format! with .to_string() in offline_mode_tests.rs - Remove unused LearningStore import in learning.rs Test helper fixes (pre-existing failures): - Update review_pr_config_with_spec_fanout to push pr-spec-validator into pr_dispatch_per_project['alpha'] (per-project config takes precedence over top-level pr_dispatch) - Same fix for review_pr_config_with_security_checklist_fanout and review_pr_config_with_test_fanout Refs #238

…d handlers' (#1056) from task/451-llm-hooks-wiring into main

… test helpers' (#1058) from task/fix-clippy-and-test-helpers into main

…ebouncing Refs #815

…ebouncing Refs #815 - Track (path -> messages_len, Instant) per file in shared HashMap - Only emit when messages.len grew AND debounce window (~250ms) elapsed - Document dedup contract on SessionConnector::watch trait method - Fix clippy collapsible-if in emission guard - Integration test: 3 incremental writes produce at most 5 emissions

…ound assertion Refs #815

AlexMikhalev · 2026-04-30T12:56:50Z

security_checklist Summary

PR #815 (task/815-session-connector-dedup) introduces debounce-based deduplication for SessionConnector::watch() emissions in the native Terraphim AI connector.

Risk: low

Verdict: pass

Findings

Low — Mutex poison propagates as silent task failure
Location: crates/terraphim_sessions/src/connector/native.rs:206

let mut guard = last_seen.lock().unwrap();

std::sync::Mutex::lock().unwrap() is called inside a tokio::spawn(async move { ... }) block. The pattern is otherwise correct (no .await held across the lock guard, so no async-mutex is needed), but if the mutex becomes poisoned (a Exit Classes inside the critical section in another spawned task), subsequent tasks will Exit Classes and be silently dropped by the Tokio runtime. The event is lost without any log message.

The critical section cannot be triggered by external input — paths originate from the notify fileSystem watcher constrained to the configured directory — so this is not exploitable. The risk is limited to silent event loss under abnormal process state.

Recommendation: replace .unwrap() with .unwrap_or_else(|e| e.into_inner()) (recover from poisoned state) or add a tracing warn before the unwrap for observability.

Informational — Unbounded HashMap accumulation
Location: crates/terraphim_sessions/src/connector/native.rs:173

last_seen: HashMap<PathBuf, (usize, Instant)> grows without eviction for the lifetime of the watcher. In a directory tree with many session files this is benign; in a long-running process watching directories subject to large-scale file creation (e.g. CI farms), it could accumulate memory without bound. Not exploitable. Consider periodic pruning of entries older than a configurable TTL if watch lifetimes are expected to be long.

Informational — Path extension check without canonicalisation
Location: crates/terraphim_sessions/src/connector/native.rs:193

if path.extension().is_some_and(|ext| ext == "jsonl")

The .jsonl extension filter is correct for the intended domain. Paths are supplied by the notify crate (constrained to RecursiveMode::Recursive under the configured directory), not by user input, so there is no path traversal surface here. No action required.

Dependency Changes

None. Cargo.toml and Cargo.lock are unchanged in this PR. No new crates introduced, no version bumps.

Scope of Changes

The remaining diff is purely additive documentation (rustdoc on enum variants, #![allow(missing_docs)] on binary crates, struct field doc-comments). No logic changes outside the connector debounce implementation and its integration test.

No unsafe blocks introduced or modified.
No secrets, tokens, or credentials touched.
No new Graph surface, serialisation boundary, or auth path added.
Existing .jsonl-only file filter preserved; no new file types admitted.

_{Last security_checklist-audited commit: 971b3e0}

AlexMikhalev · 2026-04-30T13:07:56Z

Security Audit Summary

PR #858 introduces debounce/dedup logic to NativeClaudeConnector::watch() and adds Rustdoc to several crates. No Cargo.toml or Cargo.lock changes; no new dependencies.

Risk: low

Verdict: pass

Findings

[LOW] Mutex poison propagation in production watch loop
Location: crates/terraphim_sessions/src/connector/native.rs — tokio::spawn closure inside the spawn_blocking loop

let mut guard = last_seen.lock().unwrap();

std::sync::Mutex::lock() returns Err(PoisonError) when a previous thread panicked while holding the guard. Calling .unwrap() on the Err variant re-panics the current tokio::spawn task, silently dropping it. The watcher thread itself continues; the symptom is corrupted dedup state (stale entries not updated) rather than a crash. Not externally exploitable.

Recommendation: recover from poisoned state:

let mut guard = last_seen.lock().unwrap_or_else(|e| e.into_inner());

[INFORMATIONAL] tokio::spawn inside spawn_blocking
Location: same file, watch loop body

Valid — the runtime handle is accessible from blocking threads. Lock guard is dropped before any .await boundary, so no deadlock risk.

Dependency Changes

None. Cargo.toml and Cargo.lock unchanged.

Scope not affected

No unsafe blocks introduced.
No hardcoded credentials, tokens, or secrets.
No path traversal: events filtered to .jsonl extension before parsing.
No command injection vectors.
No new serialisation trust boundaries.
#[allow(missing_docs)] is lint suppression on test/validation infrastructure only.

_{Last security-audited commit: 971b3e0}

AlexMikhalev · 2026-04-30T13:09:27Z

Requirements Traceability Summary

PR #858 — Fix #815: deduplicate SessionConnector::watch() emissions with debouncing

Branch: task/815-session-connector-dedup | Author: AlexMikhalev | Head: 971b3e0

Scope: crates/terraphim_sessions/src/connector/{mod,native}.rs (core dedup logic) + rustdoc additions for terraphim_agent, terraphim-session-analyzer, terraphim_validation.

Verdict: pass

Traceability Matrix

Req ID	Requirement	Design Ref	Impl Ref	Tests	Evidence	Status
REQ-001	`watch()` MUST NOT emit duplicate sessions on each raw `Modify` FS event (inotify/FSEvents fires on every line flush)	Dedup contract docstring on `SessionConnector::watch` trait — `connector/mod.rs:124-142`	`native.rs:169-237` — `HashMap<PathBuf,(usize,Instant)>` tracking + `debounce_duration=200ms` guard	`native.rs:646` `test_watch_dedup_on_incremental_writes`	`cargo test -p terraphim_sessions` → 54/54 pass	✅
REQ-002	Emit only when `messages.len` grew AND debounce window (≈250 ms) elapsed since last emission for that path	Same docstring — canonical strategy specified	`native.rs:206-229` `should_emit` boolean guard; `(messages_len > *last_len) && duration_since >= debounce_duration`	Same test — 3 spaced writes, assert `1 ≤ emissions ≤ 4`	Test passes with inotify on Linux	✅
REQ-003	Dedup contract MUST be documented on the `SessionConnector::watch` trait method for implementors	Self-referential — docstring IS the design artefact	`connector/mod.rs:128-137` `Dedup contract:` block added	No separate test; contract verifiable via trait doctest	`cargo doc -p terraphim_sessions` clean	✅
REQ-004	`NativeClaudeConnector` MUST expose a `with_path(PathBuf)` constructor for test isolation (avoids `~/.claude` coupling)	Inferred from test strategy — no ADR	`native.rs:26-35` `with_path()` + `override_path: Option<PathBuf>` field	`native.rs:646` test uses `NativeClaudeConnector::with_path(dir.path())`	54/54 pass	✅
REQ-005	Receiver drop detection — watcher loop MUST exit cleanly when channel closed	Inferred from resource management	`native.rs:180-184` `tx.is_closed()` check in poll loop	No dedicated test; covered implicitly (test drop at end)	Acceptable — resource leak guard, not observable correctness	⚠️
NFR-001	No regression in existing 53 tests	Existing suite	`native.rs` all existing tests updated from `NativeClaudeConnector` (unit struct) to `NativeClaudeConnector::default()`	54/54 pass (`cargo test -p terraphim_sessions`)	Verified locally	✅
NFR-002	Clippy clean	—	`cargo clippy -p terraphim_sessions`	—	Clean (0 warnings/errors)	✅
DOC-001	Missing rustdoc on 30+ public items across `terraphim_agent`, `terraphim-session-analyzer`, `terraphim_validation` (Refs #1089)	—	`crates/terraphim_agent/src/lib.rs`, `repl/commands.rs`, `repl/handler.rs`, `robot/*.rs`, `service.rs`; session-analyzer `connectors/mod.rs`, `reporter.rs`; `terraphim_validation/src/lib.rs`	`#[allow(missing_docs)]` suppressions in test infra	Rustdoc warnings absent; `cargo doc` passes	✅

Gaps

⚠️ REQ-005 — no explicit close-detection test: The tx.is_closed() guard is exercised only implicitly when the test's receiver is dropped. A dedicated test that drops the receiver mid-watch and verifies the spawned thread exits within a bounded time would make the invariant explicit. This is a follow-up, not a blocker — the implementation is correct.
ℹ️ No ADR for dedup strategy: The connector/mod.rs docstring documents the contract well, but an ADR in docs/adr/ would capture the decision context (why debounce over byte-offset tracking). The inline docs are sufficient for merge; an ADR is optional.
ℹ️ Commits 67654c0 and 0ceff5c are both prefixed fix(sessions): deduplicate… — the second (0ceff5c) is a fixup of the first (clippy + test tightening split into a third commit 375db14). Consider squashing before merge to keep git history clean.
ℹ️ Extra commits on branch (rustdoc for #1089, reports/, agent changes) are present in the diff but unrelated to the dedup fix. If the PR is intended solely for feat(learn): compile captured corrections into live thesaurus #815, these should be on a separate branch. If intended as a combined cleanup PR, the title and body should reflect that scope.

_{Last spec-validated commit: 971b3e0}

AlexMikhalev · 2026-04-30T13:24:17Z

Requirements Traceability Summary

PR #858 (task/815-session-connector-dedup) implements deduplication of SessionConnector::watch() emissions in NativeTerraphim AIConnector. The change introduces a HashMap<PathBuf, (usize, Instant)> debounce state with a 200 ms window, replaces the blocking for res in watcher_rx loop with a recv_timeout loop that polls for receiver-drop every 500 ms, and adds an injectable path override for test isolation.

Spec sources: PR #858 description, commit messages Refs #815, trait-level rustdoc on SessionConnector::watch().

Verdict: concerns

Core requirement implemented and verified. Three follow-up items noted — none are release blockers, but the rustdoc/code mismatch and the missing shutdown test are worth tracking.

Traceability Matrix

Req ID	Requirement	Design Ref	Impl Ref	Tests	Status
REQ-815-01	`watch()` MUST NOT emit duplicate sessions for a single file modified multiple times via FS events	Trait rustdoc dedup contract on `SessionConnector::watch()` in `mod.rs`	`native.rs` — `last_seen` HashMap + `should_emit` guard combining `messages_len > last_len` and debounce check	`test_watch_dedup_on_incremental_writes` — upper bound `received.len() <= 4`	✅
REQ-815-02	Debounce window of ≈250 ms; emit only after window elapses since last emission per path	Trait rustdoc ("≈250 ms" in contract description)	`debounce_duration = Duration::from_millis(200)`	`test_watch_dedup_on_incremental_writes` — writes spaced 400 ms apart, each gap exceeds window	⚠️ 200 ms vs ≈250 ms doc mismatch
REQ-815-03	Emit only when `messages.len` has grown since last emission for that path	Trait rustdoc (canonical strategy)	`messages_len > last_len && now.duration_since(last_time) >= debounce_duration`	`test_watch_dedup_on_incremental_writes` asserts `last.messages.len() == 3`	✅
REQ-815-04	At least one emission must be received per session (lower bound)	PR description ("lower-bound assertion")	Same guard — `None` arm (first time seeing file) always emits	`test_watch_dedup_on_incremental_writes` asserts `!received.is_empty()`	✅
REQ-815-05	Watch loop exits cleanly when all receivers are dropped	Implicit shutdown-safety requirement	`if tx.is_closed() { break; }` checked each 500 ms `recv_timeout` cycle	No dedicated test for the shutdown path	⚠️
NFR-815-01	`NativeTerraphim AIConnector` must accept a path override for test isolation	Implicit (test harness requirement)	`NativeTerraphim AIConnector::with_path(PathBuf)` constructor; `override_path: Option<PathBuf>` field; `default_path()` now uses `override_path.or_else(...)`	`test_watch_dedup_on_incremental_writes` uses `with_path`	✅
NFR-815-02	`last_seen` HashMap must not grow without bound in long-running sessions	Not documented	No eviction strategy — HashMap grows with each unique path seen	No test	⚠️

Gaps

⚠️ Follow-up: Rustdoc says ≈250 ms; code uses 200 ms

SessionConnector::watch() trait doc states "≈250 ms" debounce window. native.rs uses Duration::from_millis(200). Future connector implementors will follow the doc, not the code, creating silent divergence.

Recommendation: align rustdoc to "≈200 ms" or extract a named constant and reference it in the doc.

⚠️ Follow-up: No test for clean shutdown (REQ-815-05)

The tx.is_closed() → break path is exercised implicitly when the integration test drops rx at the end, but there is no assertion that the watcher thread terminates within a bounded time. For a long-running Service this matters.

Recommendation: add a test that drops rx immediately after watch() returns, then waits ≤1 s and verifies the bacGraphround thread (or channel) has gone quiescent.

⚠️ Follow-up: Unbounded last_seen HashMap (NFR-815-02)

The debounce map grows proportionally with unique paths observed and is never pruned. Benign for typical Terraphim AI project directories (tens to hundreds of files), but worth a bounded structure (LRU-1000) for correctness at scale.

Recommendation: open a follow-up issue; not a blocker for this PR.

ℹ️ Note: Two duplicate commits carry identical messages

SHAs 0ceff5c22 and 67654c075 both carry fix(sessions): deduplicate SessionConnector::watch() emissions with debouncing Refs #815. Likely a rebase artefact. Worth squashing before merge to keep the Graph clean.

_{Last spec-validated commit: 971b3e0}

AlexMikhalev · 2026-04-30T14:47:24Z

Requirements Traceability Summary

PR #858 fixes a session-watch deduplication defect in NativeTerraphim AIConnector. The change is bounded to crates/terraphim_sessions/src/connector/ — no plans/ spec files were modified (acceptable for a focused bugfix with no new external API surface).

Referenced issue cross-System confusion: PR title cites #815, but GitHub issue #815 is feat(learn): compile Terraphim Graph Embeddings: Learning Agent Guidetured corrections into live Thesaurus — an entirely different feature. The dedup requirement lives only in the PR description itself. This is a traceability gap (INFERRED requirements, no stable ID in a ticket System that matches the code change).

Verdict: concerns

Traceability Matrix

Req ID	Requirement	Design Ref	Impl Ref	Tests	Status
REQ-001	`watch()` must not emit duplicate sessions on every raw Modify event	PR description + trait doc `connector/mod.rs:124-135`	`native.rs:174-238` debounce loop	`test_watch_dedup_on_incremental_writes`	pass
REQ-002	Emit only when `messages.len` has grown since last emission	PR description	`native.rs:203-215`	same test	pass
REQ-003	Debounce window ~250 ms gates burst emissions	Trait doc "~250 ms"	`native.rs:197` `Duration::from_millis(200)`	same test (400 ms gaps)	concerns
REQ-004	`SessionConnector::watch` trait must document the dedup contract	PR description	`connector/mod.rs:124-135`	N/A	pass
REQ-005	Connector must be testable with a custom path	PR description	`native.rs:22-31` `override_path` + `with_path()`	All watch tests use `with_path`	pass
NFR-001	Watch loop exits cleanly when receiver is dropped	Implicit	`native.rs:174` `tx.is_closed()` + `native.rs:237` Disconnected arm	Implicit via test scope exit	concerns
NFR-002	Dedup state not shared across connector instances	Implicit correctness	`last_seen` scoped to spawn_blocking closure	No multi-instance isolation test	concerns

Gaps

concerns REQ-003 — Debounce constant inTerraphim Graph Embeddings: Learning Agent Guide
Trait doc says ~250 ms; implementation uses Duration::from_millis(200). The "approximately" qualifier provides some cover, but the values should be aligned. Recommend extracting a named constant const DEBOUNCE_MS: u64 = 250 referenced in both doc and code, or updating the doc to say ~200 ms. One-line follow-up.

concerns NFR-001 — No explicit receiver-drop test
The tx.is_closed() guard is present and correct. The integration test implicitly exercises this path when rx drops at scope end. However, the spawn_blocking thread may outlive the test by up to the 500 ms recv_timeout window, which can cause noise in test runners reporting lingering tasks. A test_watch_stops_when_receiver_dropped test with tokio::time::timeout would make the invariant explicit. Follow-up, not blocking.

concerns NFR-002 — No multi-instance isolation test
The dedup HashMap is scoped inside the closure so two concurrent connectors will not interfere — correct by construction. If a watch_all() aggregator is ever added, this isolation assumption should be re-verified under test. Note, not blocking.

note — Issue cross-System mismatch
PR cites Refs #815, but GitHub #815 is feat(learn): compile Terraphim Graph Embeddings: Learning Agent Guidetured corrections into live Thesaurus. The dedup requirement exists only in the PR description. Recommend creating a Gitea issue anchoring the dedup requirement and updating the PR body to reference it.

_{Last spec-validated commit: HEAD (PR #858 task/815-session-connector-dedup)}

AlexMikhalev · 2026-05-01T20:19:53Z

Requirements Traceability Summary

PR: #858 — Fix #815: deduplicate SessionConnector::watch() emissions with debouncing
Validator: Carthos (Domain Architect)
Date: 2026-05-01
Head SHA: pr-858 (fetched from origin pull/858/head)

This PR is tightly scoped to two files:
crates/terraphim_sessions/src/connector/mod.rs — trait documentation
crates/terraphim_sessions/src/connector/native.rs — watch loop rewrite

The change replaces an unbounded event-loop with a debounced, closed-channel-aware loop that tracks (path → messages.len, last_emission_time) to suppress duplicate emissions on inotify/FSEvents flush notifications.

Verdict: concerns

The core dedup invariant is correctly implemented and the integration test test_watch_dedup_on_incremental_writes provides meaningful coverage. Three concerns prevent a clean pass:
(1) The debounce duration is 200 ms in code (Duration::from_millis(200)) but the PR description and trait doc say 250 ms — a documentation/code inconsistency that should be resolved before merge;
(2) The integration test uses real tokio::time::sleep calls (total ~2.3 s of sleep per run), not tokio::time::pause, which will slow CI and is fragile on loaded runners — tokio::time::pause is the idiomatic approach for time-dependent tests in the project;
(3) No ADR or design note justifies the choice of messages.len as the dedup key over byte-offset or session hash — future maintainers will not understand why a growing-but-resetting message count (e.g., session file truncation) would trigger a spurious emission.

Traceability Matrix

Req ID	Requirement	Design Ref	Impl Ref	Tests	Evidence	Status
REQ-1	watch() must not emit duplicate sessions on rapid FS events (inotify flush storm)	No ADR — MISSING	`native.rs` — `last_seen: Arc<Mutex<HashMap<PathBuf,(usize,Instant)>>>` + emit guard (lines ~182–220)	`test_watch_dedup_on_incremental_writes`	PR test plan: 54 tests pass; dedup test asserts count ≤ 4 with lower-bound	⚠️
REQ-2	Dedup strategy: emit only when messages.len grew AND debounce window elapsed	PR description §bullet 2; trait doc `mod.rs`	`native.rs` — `messages_len > last_len && now.duration_since(last_time) >= debounce_duration`	`test_watch_dedup_on_incremental_writes`	Integration test covers both conditions; timing relies on real sleep	⚠️
REQ-3	Watch loop must exit cleanly when receiver is dropped (no zombie spawned tasks)	Implicit from CLAUDE.md async guidelines	`native.rs` — `if tx.is_closed() { break }` at loop start; `Disconnected` arm on `recv_timeout`	`test_watch_dedup_on_incremental_writes` (rx dropped at end)	Test drops rx; no assertion on clean exit but watcher thread terminates via Disconnected arm	⚠️
REQ-4	NativeClaudeConnector must accept a custom path for test isolation	PR description §bullet 1 (implied)	`native.rs` — `override_path: Option<PathBuf>` field; `with_path(path: PathBuf)` constructor	All existing tests updated to `::default()`; new test uses `::with_path(dir.path())`	Mechanically correct; all existing tests compile with `default()`	✅
REQ-5	Dedup contract documented on SessionConnector::watch() trait method	PR description §bullet 3	`mod.rs` — doc comment "Dedup contract" block on `watch()`	n/a (documentation)	Doc comment present; 200ms/250ms inconsistency flagged below	⚠️
REQ-6	clippy collapsible_if fixed in emission guard	PR description §bullet 4	`native.rs` — guard merged into single `if should_emit && tx.send...`	`cargo clippy -p terraphim_sessions` in test plan	Stated clean; not independently verified here	✅
REQ-7	Integration test lower-bound assertion added; upper bound narrowed from 5 → 4	PR description §bullet 5	`native.rs` — test asserts `count >= 1 && count <= 4`	`test_watch_dedup_on_incremental_writes`	Improvement over prior unasserted count; see timing concern below	⚠️

Gaps

⚠️ Debounce constant inconsistency (follow-up, merge-blocker candidate)

debounce_duration = Duration::from_millis(200) in native.rs but the trait doc and PR description state "≈250 ms". The trait doc forms part of the public contract (mod.rs). Callers reading the doc will expect 250 ms headroom; the implementation fires at 200 ms. One-line fix: align doc to 200 ms or change constant to 250 ms and update the test sleeps accordingly.

⚠️ Real-time sleeps in tests (follow-up)

test_watch_dedup_on_incremental_writes accumulates ~2.3 s of tokio::time::sleep per run. The project's CLAUDE.md states: "Use tokio::time::pause for testing time-dependent code without real delays." The test does not use tokio::time::pause, making it slow and potentially flaky on CI runners under load. Recommended fix: redesign the test to write via a mock clock, or use tokio::time::pause with tokio::time::advance. If real FS events are required (cannot mock), document the reason explicitly in the test.

⚠️ No ADR for dedup key choice (follow-up)

messages.len as the dedup key is reasonable for append-only JSONL files, but:

If a session file is truncated and rewritten (e.g., compaction), messages.len resets and the old baseline is stale — the guard would suppress the new content until messages.len exceeds the old high-water mark.
Alternative keys (byte-offset, session hash) would detect truncation correctly.

No design note captures this trade-off. Recommend a one-paragraph comment in native.rs at the last_seen declaration explaining the key invariant and the truncation limitation. This does not block merge but will cause confusion at the next modification.

ℹ️ Referenced issue #815 is a merged PR, not a dedup issue

gh issue view 815 resolves to "feat(learn): compile captured corrections into live thesaurus" (merged PR). The session-watch dedup requirement source is therefore untraceable on GitHub. If #815 refers to a Gitea issue (which is currently 404), this cross-reference is broken. Recommend updating the PR title/description to reference the actual tracked requirement (Gitea issue or GitHub issue), or adding a "Background" section in the PR body.

_{Last spec-validated commit: pr-858}

AlexMikhalev and others added 30 commits April 21, 2026 10:25

Merge pull request 'Fix adf-fleet#34: ROC v1 Step F - verdict polling…

b36be90

… + AutoMerge enqueue' (#717) from task/roc-v1-step-f-verdict-polling into main

[agent] feat(tracker): GiteaTracker::merge_pull + GiteaMergeResult — …

7a2cc45

…Refs adf-fleet#35

[agent] feat(orchestrator): AutoMergeExecutor trait + GiteaPrTracker …

380763e

…impl — Refs adf-fleet#35

[agent] feat(orchestrator): handle_auto_merge (defensive re-check + P…

041c75f

…ostMergeTestGate enqueue + [ADF] issue on failure) — Refs adf-fleet#35

[agent] feat(orchestrator): wire AutoMerge arm in reconcile_tick to h…

5777b2d

…andle_auto_merge — Refs adf-fleet#35

Merge pull request 'Fix #35: AutoMerge handler (ROC v1 Step G)' (#719)…

680a93a

… from task/roc-v1-step-g-auto-merge-handler into main

Merge pull request 'Fix #36: post-merge test gate (ROC v1 Step H)' (#724

1669c3f

) from task/roc-v1-step-h-post-merge-gate into main

Merge pull request 'Fix #37: Quickwit event tags for auto-review/merg…

d34245f

…e flow (ROC v1 Step I)' (#726) from task/roc-v1-step-i-quickwit-events into main

Merge pull request 'Fix #39: cron cadence doc + developer-tier */20 b…

ada1f44

…ump (ROC v1 Step K)' (#727) from task/roc-v1-step-k-cron-extension into main

[agent] docs(runbooks): ROC v1 staged rollout -- digital-twins -> odi…

fd7de5f

…lo -> terraphim -- Refs adf-fleet#40 Co-Authored-By: Terraphim AI <noreply@terraphim.cloud>

Merge pull request 'Fix #40: ROC v1 rollout runbook (ROC v1 Step L)' (#…

5c94041

…728) from task/roc-v1-step-l-rollout-runbook into main

Merge pull request 'feat(orchestrator): bump kimi-for-coding registra…

2384688

…tion to k2p6' (#734) from task/kimi-k2p6-provider into main

docs: add handover for 2026-04-20 security fixes session Refs #630

2528274

chore: bump version to 1.16.35 for release

aebf67c

fix(terraphim_agent): utf-8 safe snippet truncation (#824)

51b1c49

Cherry-pick from local main (0115f06f). Refs #612 Co-authored-by: Test User <test@test.com>

Merge pull request 'feat(orchestrator): restore openai to C1 allow-li…

f8059d1

…st + KG routing' (#737) from task/kg-planning-tier-k2p6 into main

feat(sessions): implement ClineConnector for VS Code extension sessions

2e205b3

feat(agent): implement LearningInjector and learn inject CLI command

f23338e

fix(agent): harden shared learning startup hydration

b0f5253

Handle missing learnings root as an empty store on first run and make startup deduplication deterministic by preferring canonical agent copies over shared replicas when IDs collide. Refs #6

Test User and others added 23 commits April 27, 2026 17:52

test(orchestrator): cover pr-security-sentinel on PR open (ADF Phase …

ef3870f

…2c) Refs #953 Rebased onto main after Phase 2b/d/e merges.

feat(robot): add query and role fields to ResponseMeta Refs #1026

3e3ce04

Merge pull request 'feat(adf): Phase 2c — pr-security-sentinel on PR …

de6c541

…open with path filter' (#1028) from task/953-adf-phase-2c-security-sentinel into main

Merge remote-tracking branch 'gitea/main'

69f6e4b

Merge pull request 'feat(robot): add query and role fields to Respons…

4b5ef3c

…eMeta Refs #1026' (#1029) from sync/1026-query-role-meta into main

docs: add Phase 4 verification and Phase 5 validation reports for #1026

b171add

Merge pull request 'docs: Phase 4 verification + Phase 5 validation r…

c470034

…eports for #1026' (#1030) from sync/vv2-1026 into main

feat: implement #1034 - msteams-sdk-test.mjs with comprehensive MS Te…

cd10835

…ams SDK tests Refs #1034

docs: update CHANGELOG and generate documentation gap report

bb166b0

- Add [Unreleased] section with commits since 1.17.0 - Generate doc gap report: 771 undocumented items across 46 crates - Generate API reference snippets from documented public APIs Refs #1046

docs(adf): add operations guide and blog post for PR fan-out deployment

f84a364

Merge pull request 'Fix #451: Wire LLM pre/post hooks in agent comman…

bd263aa

…d handlers' (#1056) from task/451-llm-hooks-wiring into main

Merge pull request 'fix(clippy): resolve warnings and fix PR dispatch…

ac188b7

… test helpers' (#1058) from task/fix-clippy-and-test-helpers into main

fix(sessions): deduplicate SessionConnector::watch() emissions with d…

67654c0

…ebouncing Refs #815

test(sessions): tighten dedup integration test bounds and add lower-b…

375db14

…ound assertion Refs #815

AlexMikhalev force-pushed the main branch from 0185b51 to df345a8 Compare May 1, 2026 17:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #815: deduplicate SessionConnector::watch() emissions with debouncing#858

Fix #815: deduplicate SessionConnector::watch() emissions with debouncing#858
AlexMikhalev wants to merge 2266 commits intomainfrom
task/815-session-connector-dedup

AlexMikhalev commented Apr 30, 2026

Uh oh!

AlexMikhalev commented Apr 30, 2026

Uh oh!

AlexMikhalev commented Apr 30, 2026

Uh oh!

AlexMikhalev commented Apr 30, 2026

Uh oh!

AlexMikhalev commented Apr 30, 2026

Uh oh!

AlexMikhalev commented Apr 30, 2026

Uh oh!

AlexMikhalev commented May 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

AlexMikhalev commented Apr 30, 2026

Summary

Test plan

Uh oh!

AlexMikhalev commented Apr 30, 2026

security_checklist Summary

Risk: low

Verdict: pass

Findings

Dependency Changes

Scope of Changes

Uh oh!

AlexMikhalev commented Apr 30, 2026

Security Audit Summary

Risk: low

Verdict: pass

Findings

Dependency Changes

Scope not affected

Uh oh!

AlexMikhalev commented Apr 30, 2026

Requirements Traceability Summary

Verdict: pass

Traceability Matrix

Gaps

Uh oh!

AlexMikhalev commented Apr 30, 2026

Requirements Traceability Summary

Verdict: concerns

Traceability Matrix

Gaps

Uh oh!

AlexMikhalev commented Apr 30, 2026

Requirements Traceability Summary

Verdict: concerns

Traceability Matrix

Gaps

Uh oh!

AlexMikhalev commented May 1, 2026

Requirements Traceability Summary

Verdict: concerns

Traceability Matrix

Gaps

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant